The motivation of this research was to evaluate the main memory performance of a hybrid supercomputer such as the Convey HC-x, and to ascertain how the memory controller performs in several access scenarios vis-à-vis hand-coded memory prefetches. Such memory access patterns are very useful in stencil computations. The theoretical memory bandwidth of the Convey is compared with the results of our measurements. An accurate study of the memory subsystem is particularly useful for users who are developing their own application-specific personalities. Experiments were performed to measure the bandwidth between the coprocessor and the memory subsystem, aimed mainly at measuring the read access speed of the memory from the Application Engines (FPGAs). Different ways of accessing data were tried in order to find the most efficient way to access memory, and the most efficient one is proposed for future work on the Convey HC-x. When a series of memory accesses is performed, non-uniform latencies occur, and the Memory Controller of the Convey HC-x coprocessor attempts to hide this latency. We measure memory efficiency as the ratio of the number of memory accesses to the number of execution cycles. The result of this measurement converges to one in most cases. In addition, we performed experiments with hand-coded memory accesses. The analysis of the experimental results shows how the memory subsystem and the Memory Controllers work. From this work we conclude that the Memory Controllers do an excellent job, largely because (transparently to the user) they seem to cache large amounts of data, and hence hand-coding is not needed in most situations.
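The efficiency metric described above can be illustrated with a minimal sketch. The function name and the counter values below are hypothetical and chosen only for illustration; they are not measurements from the Convey HC-x.

```python
def memory_efficiency(num_accesses: int, num_cycles: int) -> float:
    """Return the ratio of memory accesses to execution cycles.

    A value close to 1.0 means that roughly one access completes per
    cycle, i.e. the access latency is almost entirely hidden.
    """
    if num_cycles <= 0:
        raise ValueError("num_cycles must be positive")
    if num_accesses < 0:
        raise ValueError("num_accesses must be non-negative")
    return num_accesses / num_cycles


# Hypothetical counter readings (assumed values, not measured data).
well_hidden = memory_efficiency(num_accesses=1000, num_cycles=1000)
partly_stalled = memory_efficiency(num_accesses=500, num_cycles=1000)
```

With these assumed counters, the first case corresponds to one access per cycle, while the second corresponds to a run in which half of the cycles did not complete an access.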